An Efficient Index Lattice for XML Query Optimization
Abstract
Structural indexes of XML data can effectively reduce the search space for the evaluation of path queries over the data. The indexes partition the structural graph of an XML document into equivalence classes of nodes, which are then condensed into index nodes. However, structural indexes are inadequate for queries with value-based conditions, since equivalent nodes in the same partition become distinguishable by their data contents. In practice, only a small portion of the nodes in each partition is relevant to the processing of a value-based condition. To enhance the applicability of structural indexes, we propose a lattice structure defined on an XML structural index that we call the Structure Index Tree (SIT). The index is defined as partitions of equivalent paths in an XML document, while an element in the lattice, which we call a SIT-Lattice Element (SLE), is an index of an arbitrary subset of paths in the document. Since paths represent the structure of XML data and each text node is associated with a unique path, we can define an SLE to filter out both irrelevant structures and irrelevant text nodes. We propose a set of SLE operations and devise efficient techniques to generate SLEs that can be tailored to query workloads. Our experiments show that SLEs significantly speed up the evaluation of path queries with value-based and aggregation-based conditions. We also demonstrate that SLEs can support effective querying over very large XML documents on memory-limited hand-held devices.
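The idea of indexing paths and then keeping only a workload-relevant subset of them can be illustrated with a minimal Python sketch. This is not the SIT or SLE construction from the paper: it assumes the simplest possible partition (elements grouped by their root-to-node label path), and the names `build_path_index`, `restrict_index`, `select_by_value`, and the sample document are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(root):
    """Group elements by root-to-node label path (assumed, simplified partition)."""
    index = defaultdict(list)

    def visit(node, path):
        index[path].append(node)
        for child in node:
            visit(child, f"{path}/{child.tag}")

    visit(root, f"/{root.tag}")
    return dict(index)


def restrict_index(index, paths):
    """Keep only the chosen paths: a smaller index over a subset of paths."""
    return {p: nodes for p, nodes in index.items() if p in paths}


def select_by_value(index, path, predicate):
    """Apply a value-based condition to the text of the nodes under one path."""
    return [n for n in index.get(path, []) if predicate((n.text or "").strip())]


# Hypothetical sample document, used only for this illustration.
DOC = """<lib>
  <book><title>XML Indexing</title><price>30</price></book>
  <book><title>Query Plans</title><price>55</price></book>
  <cd><title>Lattice</title><price>12</price></cd>
</lib>"""

root = ET.fromstring(DOC)
full_index = build_path_index(root)

# A workload that only asks about book prices needs one path, not the whole index.
small_index = restrict_index(full_index, {"/lib/book/price"})
cheap = select_by_value(small_index, "/lib/book/price", lambda t: float(t) < 40)
print([n.text for n in cheap])
```

In this sketch the restricted index drops both the structures and the text nodes that the query never touches, which is the filtering effect the abstract attributes to an SLE; the actual equivalence relation and lattice operations are defined in the paper itself.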
Similar resources
VAMANA : A High Performance, Scalable and Cost Driven XPath Engine
Many applications are migrating to or beginning to make use of native XML data. We anticipate that queries will emerge that emphasize the structural semantics of XML query languages like XPath and XQuery. This brings a need for an efficient query engine and database management system tailored for XML data similar to traditional relational engines. While mapping large XML documents into relational dat...
Framework-Based Development and Evaluation of Cost-Based Native XML Query Optimization Techniques
Reflecting on the history of database management systems reveals that cost-based query optimization has been the dominating method for effectively answering complex queries on large documents. Native XML database management systems provide an efficient infrastructure for storing, indexing, and querying large XML documents. Even though such systems can choose from a huge set of structural join o...
Relational Approach to Logical Query Optimization of XPath
To be able to handle the ever growing volumes of XML documents, effective and efficient data management solutions are needed. Managing XML data in a relational DBMS has great potential. Recently, effective relational storage schemes and index structures have been proposed as well as special-purpose join operators to speed up querying of XML data using XPath/XQuery. In this paper, we address the...
XML Data Storage and Query Optimization in Relational Database by XPath Processing Model
XML is the de facto new standard for data representation and exchange on the web. Along with the growth of XML data, traditional relational databases support XML data processing across the board. Consistent storage and efficient querying of XML data are the chief problems in XML-supported relational databases. This work presents mechanisms of storage and query optimization for XML data in relational ...
Efficient query processing and index tuning using proximity scores
In the presence of growing data, the need for efficient query processing under result quality and index size control becomes more and more a challenge to search engines. We show how to use proximity scores to make query processing effective and efficient with focus on either of the optimization goals. More precisely, we make the following contributions: • We present a comprehensive comparative ...
Publication date: 2004